giCentre, City University London - GeoProcessing

VAST 2009 Challenge
Challenge 2 - Social Network and Geospatial

Authors and Affiliations:

Jo Wood, the giCentre, City University London, jwo@soi.city.ac.uk [PRIMARY contact]
Aidan Slingsby, the giCentre, City University London, a.slingsby@soi.city.ac.uk
Naz Khalili-Shavarini, the giCentre, City University London, nazanin.khalili-shavarini.1@city.ac.uk
Jason Dykes, the giCentre, City University London, jad7@soi.city.ac.uk
David Mountain, the giCentre, City University London, dmm@soi.city.ac.uk

Tool(s):

The main tools used were written in Processing (www.processing.org) specifically for this challenge. We regard Processing itself (effectively a set of libraries and IDE for use with Java) as a tool as it enables rapid prototyping and development of visualization software.

Three applications were developed within the first week of the group starting the challenge:

We also used the following:

Video:


See also the higher quality video: movie.mp4



ANSWERS:


MC2.1: Which of the two social structures, A or B, most closely match the scenario you have identified in the data?

A (one middleman, three handlers)




MC2.2: Provide the social network structure you have identified as a tab-delimited file.

See Flitter.txt




MC2.3: Characterize the difference between your social network and the closest social structure you selected (A or B). If you include extra nodes please explain how they fit in to your scenario or analysis.

The core network (employee-handlers-middleman-leader) conforms to Scenario A (see Figure 2.3a). The employee (Schaffter) has 40 contacts. The handlers (Kushnir, Pettersson and Reitenspies) have 31, 37 and 33 contacts respectively; they are unconnected to each other but share a common middleman (Good), who has 5 contacts. The Fearless Leader (Szemeredi) has a broad Flitter network of 256 contacts spanning all four countries.

Our suspected network is extended to account for international links to the criminal network. Szemeredi has a total of 14 direct international contacts across all three bordering countries. However, we suspect there is a further indirect international link to Posana via three 'international middlemen' (Kroon, Krupp and Quisquater) based in Koul. Each has a contact in Otello, forming a 'fledgling Scenario B'. Evidence to support this is that these international middlemen are based in a single large city (giving the Fearless Leader a "presence in a larger city"), each with a small number of contacts (5, 6 and 5 respectively). They each have a contact in Otello (Aalberg, Cacciabue, Milberg) with a larger number of contacts, but who are not connected to one another. This network differs from Scenario B in two respects. Firstly, the middlemen have a few more contacts than the suspected "one or two others". This could be explained by their possible middleman roles in other countries (e.g. Kroon-Minka in Transak). Secondly, due to the small number of members in bordering countries, we suspect that the criminal networks there may be less advanced than in Flovania, possibly relying on other forms of communication.

Network view of criminal network
Figure 2.3a Suspected criminal network (Screenshot from NetView; legend superimposed).

A team of five worked on this challenge. We agreed to spend the first three weeks working on the problem individually, deliberately not sharing any data, results or conclusions. This maximised the chances of spotting any blunders, unjustified assumptions or inferences and allowed us to triangulate any common conclusions. After three weeks we shared our results, demonstrating the visualization applications we had built and the reasoning behind our conclusions. Independently, three of the investigators came to the common conclusion that the core network conformed to Scenario A, comprising Szemeredi-Good-(Kushnir, Pettersson, Reitenspies)-Schaffter. One investigator suggested that neither Scenario A nor Scenario B was possible, based on a stricter reading of "presence in a larger city" for the Fearless Leader. This sharing of assumptions encouraged us to incorporate uncertainty in our visualization approaches. Each application (all written in Java/Processing) took between about 12 and 36 hours of programming time.

Initial work concentrated on summarising the network characteristics to see if that suggested any filtering that could be applied (e.g. unconnected trees in the network). City locations were digitized using LandSerf and base mapping redrafted to make geographic interpretation easier (see Figure 2.3b). Coordinates of city locations were added to a normalised MySQL database holding tables People, Cities and Links.

Flitter flow map
Figure 2.3b Flitter connections with redrafted basemapping (screenshot from FlowMappa; legend added).

Measured network statistics revealed that all 6000 nodes were part of a single connected component with a radius of 3 and a diameter of 5. This suggested the network was well connected and there were no significantly remote sub-networks. Node eccentricity and betweenness were calculated for each person but, due to the well-connected nature of the graph, revealed little of use in identifying the criminal network. The frequency distribution of node degree (number of contacts per person) showed a log-normal distribution with a modal value of 5 contacts per person. The fact that no one had fewer than 4 contacts would suggest that Scenario B would not be possible within the network. However, to account for possible uncertainty in our data and criminal network intelligence, this possibility was not ruled out entirely.
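As an illustration of how radius, diameter and eccentricity can be measured on an unweighted contact graph, the following is a minimal sketch using breadth-first search from every node. The class and method names (NetworkStats, adjacency, eccentricity, radiusAndDiameter) are hypothetical and are not taken from the applications described here; people are assumed to be identified by integer indices.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: unweighted graph statistics via breadth-first search. */
final class NetworkStats {

    /** Builds an undirected adjacency list for n nodes from {a, b} index pairs. */
    static List<List<Integer>> adjacency(int n, int[][] links) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
        for (int[] link : links) {
            adj.get(link[0]).add(link[1]);
            adj.get(link[1]).add(link[0]);
        }
        return adj;
    }

    /** Longest shortest-path distance from start, or -1 if some node is unreachable. */
    static int eccentricity(List<List<Integer>> adj, int start) {
        int[] dist = new int[adj.size()];
        Arrays.fill(dist, -1);
        dist[start] = 0;
        ArrayDeque<Integer> queue = new ArrayDeque<>();
        queue.add(start);
        int reached = 0;
        int farthest = 0;
        while (!queue.isEmpty()) {
            int u = queue.poll();
            reached++;
            farthest = Math.max(farthest, dist[u]);
            for (int v : adj.get(u)) {
                if (dist[v] < 0) {
                    dist[v] = dist[u] + 1;
                    queue.add(v);
                }
            }
        }
        return reached == adj.size() ? farthest : -1;
    }

    /** Returns {radius, diameter}, or null if the graph is not connected. */
    static int[] radiusAndDiameter(List<List<Integer>> adj) {
        int radius = Integer.MAX_VALUE;
        int diameter = 0;
        for (int i = 0; i < adj.size(); i++) {
            int e = eccentricity(adj, i);
            if (e < 0) return null;
            radius = Math.min(radius, e);
            diameter = Math.max(diameter, e);
        }
        return new int[] {radius, diameter};
    }
}
```

Running one search per node is quadratic in the worst case, which is still practical for a graph of a few thousand people.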

The NetView application represented each person as an ellipse coloured according to its suspected role. Roles were defined purely on the number of contacts and took into account the uncertainty in their definition (e.g. "handlers probably [have] between 30 to 40 Flitter contacts"). Ellipses were sized according to the certainty of their role(s) - the further from the optimal number of contacts, the smaller the ellipse. Colouring used an alpha opacity of 50% so that multiple roles could be assigned to a single person.
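The sizing rule can be sketched as a certainty score that is 1 inside a role's expected contact range and falls off with distance from it. This is only one plausible reading: the linear fall-off, the tolerance parameter and the names RoleCertainty, certainty and ellipseDiameter are assumptions, and only the 30 to 40 handler range comes from the challenge text quoted above.

```java
/** Hypothetical sketch of role certainty derived from a person's contact count. */
final class RoleCertainty {

    /**
     * Returns 1.0 when contacts lies in [low, high], decreasing linearly to 0.0
     * once it is `tolerance` or more contacts outside that range.
     * The linear fall-off is an assumption, not the documented NetView rule.
     */
    static double certainty(int contacts, int low, int high, int tolerance) {
        int gap = 0;
        if (contacts < low) {
            gap = low - contacts;
        } else if (contacts > high) {
            gap = contacts - high;
        }
        return Math.max(0.0, 1.0 - (double) gap / tolerance);
    }

    /** Scales an ellipse so that less certain roles are drawn smaller. */
    static double ellipseDiameter(double certainty, double maxDiameter) {
        return certainty * maxDiameter;
    }
}
```

For example, with the handler range of 30 to 40 and an assumed tolerance of 10, a person with 45 contacts would receive a certainty of 0.5 and be drawn at half the maximum size.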

Edges between people were drawn if they could form part of a criminal network (e.g. lines showing contacts between potential Fearless Leaders and Middlemen, but not between Leaders and Handlers). People were positioned according to user-selected rules: (i) by geography of their home city; (ii) by role; (iii) using spring embedding that grouped nodes by role; (iv) manual user positioning. Mousing over any person provided more details about them and their contacts. People and edges could be selected, highlighted and grouped to allow exploration of hypotheses.

NetView showing people grouped by role and functional connectivity
Figure 2.3c Screenshot from NetView showing spring-embedded positioning of people grouped by role and functional connectivity.

Figure 2.3c shows all people assigned a role positioned using spring embedding. Middlemen to the left of the Fearless Leaders could not be part of the criminal network as they aren't 'pulled' towards the handlers and employees to the right. Iterative filtering of the network could be applied by the user prompting the automatic elimination of people who do not conform with the connections of scenario A or B.
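One way to express this kind of iterative elimination is sketched below. It assumes a simplified chain of roles (employee, handler, middleman, leader) in which a person keeps a candidate role only while they have at least one contact holding each adjacent role in the chain; removing a role can invalidate a neighbour's role, so the pass repeats until nothing changes. The names RoleFilter, prune, supported and hasNeighbour are hypothetical, and the real scenarios impose further constraints (such as three handlers) that this sketch ignores.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;

/** Hypothetical sketch: repeatedly drop roles that lack supporting neighbours. */
final class RoleFilter {

    /** Roles in chain order; each role must link to the roles next to it. */
    enum Role { EMPLOYEE, HANDLER, MIDDLEMAN, LEADER }

    /** Returns a pruned copy of the candidate roles; the input is not modified. */
    static List<EnumSet<Role>> prune(List<List<Integer>> adj, List<EnumSet<Role>> candidates) {
        List<EnumSet<Role>> roles = new ArrayList<>();
        for (EnumSet<Role> c : candidates) roles.add(EnumSet.copyOf(c));
        boolean changed = true;
        while (changed) {
            changed = false;
            for (int p = 0; p < roles.size(); p++) {
                for (Role r : Role.values()) {
                    if (roles.get(p).contains(r) && !supported(adj, roles, p, r)) {
                        roles.get(p).remove(r);
                        changed = true;
                    }
                }
            }
        }
        return roles;
    }

    /** True if person p has a neighbour in each role adjacent to r in the chain. */
    static boolean supported(List<List<Integer>> adj, List<EnumSet<Role>> roles, int p, Role r) {
        Role[] chain = Role.values();
        int i = r.ordinal();
        boolean below = i == 0 || hasNeighbour(adj, roles, p, chain[i - 1]);
        boolean above = i == chain.length - 1 || hasNeighbour(adj, roles, p, chain[i + 1]);
        return below && above;
    }

    /** True if any contact of p is still a candidate for the wanted role. */
    static boolean hasNeighbour(List<List<Integer>> adj, List<EnumSet<Role>> roles, int p, Role wanted) {
        for (int q : adj.get(p)) {
            if (roles.get(q).contains(wanted)) return true;
        }
        return false;
    }
}
```

In this sketch a candidate middleman connected to a leader but to no candidate handler loses the middleman role, which mirrors the middlemen in Figure 2.3c that are not pulled towards the handlers.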

Filtering led to three candidate networks as shown in Figure 2.3d. With edges drawn, any connections between handlers in the same criminal network were automatically highlighted with a curved line, allowing the elimination of the Inenaga network as it contradicts the assumption that "Boris [...] does not allow [handlers] to communicate among themselves". The Cornell network was eliminated because the smaller size of the Rosch node suggested they had more than the 5 contacts permitted for a middleman. This was confirmed by using the mouse 'tooltip' providing details of Rosch's connections and geography (not located in a "smaller city").

NetView showing possible criminal networks
Figure 2.3d Screenshot from NetView showing three candidate criminal networks with mouseover 'tooltip' providing detail on Rosch. Connections between handlers automatically highlighted (ruling out the Inenaga network). Szemeredi network highlighted by user as part of the hypothesis exploration process.
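The handler check that rules out a candidate network can be sketched as a test over every pair of handlers in that network. The names HandlerCheck and connectedPairs are hypothetical, and people are assumed to be identified by integer ids mapped to their sets of contacts.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: find handlers in one candidate network who contact each other. */
final class HandlerCheck {

    /**
     * Returns every pair {a, b} of handlers that are direct contacts.
     * A non-empty result contradicts the rule that handlers do not communicate.
     */
    static List<int[]> connectedPairs(Map<Integer, Set<Integer>> contacts, List<Integer> handlers) {
        List<int[]> pairs = new ArrayList<>();
        for (int i = 0; i < handlers.size(); i++) {
            for (int j = i + 1; j < handlers.size(); j++) {
                int a = handlers.get(i);
                int b = handlers.get(j);
                // Check both directions in case the contact data is stored one-way.
                boolean linked = contacts.getOrDefault(a, Set.of()).contains(b)
                        || contacts.getOrDefault(b, Set.of()).contains(a);
                if (linked) pairs.add(new int[] {a, b});
            }
        }
        return pairs;
    }
}
```

A candidate network would be kept only if the returned list is empty for its three handlers.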

International contacts were explored by limiting visualization to a single country with the addition of Szemeredi identified from the previous stage. All national connections were shown as lines with those connected to Szemeredi highlighted (see Figure 2.3e). Visual inspection of edges showed that no complete graph representing either Scenario A or B existed in any of the bordering countries. Nor did they show any similar topology connected directly to Szemeredi. This suggested further people must be involved in the international network. Our assumption was that these people would have some kind of middleman role and be located in a large city (Szemeredi's "presence in a larger city"). So all large-city people with fewer than 7 contacts were added to the national networks, revealing a fledgling Scenario B in Posana (see MC2.4 and 2.5).

NetView showing all connections within bordering countries and Szemeredi
Figure 2.3e Screenshots from NetView showing all connections within bordering countries and Szemeredi.



MC2.4: How is your hypothesis about the social structure in Part 1 supported by the city locations of Flovania? What part(s), if any, did the role of geographical information play in the social network of part one?

Target (Schaffter) and handlers (Reitenspies, Pettersson, Kushnir) are based in the same large city (Prounov). This confirms "target may be in a large city" and assists face-to-face interaction between target and handlers. It supports rejection of the Lafouge-(Lonning, Formenti, Krinz) and Supornpaibul-(Gusat, Letelier, Bailey) networks, having protagonists in multiple cities including a smaller one (Sresk).

The location of the middleman (Good) in Kannvic is consistent with "middleman might be in nearby smaller location". Kannvic is both a small city and close to Prounov. It supports the rejection of potential middlemen (Cuatto and Rosch) who are located in a larger city (Koul).

Leader (Szemeredi) is less vulnerable to discovery in a small remote city (Ryzkland), but "require[s] a presence in a larger city" via Kroon, Krupp and Quisquater in Koul, fulfilling international contact middlemen roles. The entire criminal network forms a 'northern-axis' comprising Ryzkland, Otello, Kannvic, Prounov and Koul.

Geography of the criminal network
Figure 2.4a Geographic distribution of the criminal network.



MC2.5: In general, how are the Flitter users dispersed throughout the cities of this challenge? Which of the surrounding countries may have ties to this criminal operation? Why might some be of more significant concern than others?

There are Flitter contacts between all pairs of cities in the study area (24*12/2 = 72 bi-directional city flows). Figure 2.5a (left) shows that the majority of contacts are between Koul, Prounov and Kouvnic, which are also the cities with the highest numbers of Flitter members. Figure 2.5a (right) shows an OD map of contacts. Each large square represents all contacts associated with a given city. Each smaller coloured square is the number of contacts between the city and every other one. No one bordering country has significantly more contacts with Flovania than the others, although the majority of connections are towards the north of the region.

OD treemap of all Flitter contacts
Figure 2.5a Left: Flow map of inter-city Flitter contacts. Number indicates number of Flitter members in each city; Right: OD map showing numbers of contacts between all combinations of cities.
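The counts behind an OD (origin-destination) map of this kind can be sketched as a symmetric city-by-city matrix aggregated from individual contacts. The names OdMatrix and build, and the use of integer indices for people and cities, are assumptions for illustration rather than the structure of the applications or database described here.

```java
/** Hypothetical sketch: aggregate person-to-person contacts into city-to-city counts. */
final class OdMatrix {

    /**
     * counts[i][j] is the number of contacts linking a person in city i with a
     * person in city j. Contacts are undirected, so the matrix is symmetric;
     * contacts within one city are counted once on the diagonal.
     */
    static int[][] build(int cityCount, int[] cityOfPerson, int[][] links) {
        int[][] counts = new int[cityCount][cityCount];
        for (int[] link : links) {
            int a = cityOfPerson[link[0]];
            int b = cityOfPerson[link[1]];
            counts[a][b]++;
            if (a != b) counts[b][a]++;
        }
        return counts;
    }
}
```

Each row of the matrix corresponds to one large square of the OD map, and each cell to one of the smaller coloured squares within it.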

There are direct connections between the Fearless Leader and all three bordering countries (see Figure 2.5b). Posana is of greatest concern as these contacts show a fledgling 'Scenario B' structure. Transak has the greatest national connectivity, although this may simply be due to Bates' prolific Flitter behaviour.

International Networks with Large City Related Other providing contact
Figure 2.5b International contacts of Szemeredi and potential large city handlers.